In this project, we aim to present an interactive display of adult obesity rates across U.S. states and, more pertinently, to analyze how differences in nutritional intake across states correlate with obesity rates. In addition, we analyze differences in the level of physical activity across states. Since it is widely known that diet and physical exercise play a key role in one’s weight and health, we believe that states with a large number of fast food outlets see a greater occurrence of obesity, as the presence of such unhealthy food options encourages adults living in these states to adopt unhealthy diets, thereby neglecting their weight and health. In particular, our analysis consists of three major components:

  1. Obesity Rate
  2. Adult Nutrition and Physical Activities
  3. Fast Food Prevalence

Dataset

  1. CDC Nutrition, Physical Activity, and Obesity data
  2. Fast Food data
  3. Twitter API data

Obesity Rate

Over Year Line Graph

The graph shows that the overall obesity rate increased from 27.4% to 30.1% over the period covered. Moreover, the obesity rate among men is higher than among women.

Over Year Line Graph by Education

People with higher levels of education tend to have lower obesity rates, and the obesity rate of people with a college degree is the lowest.

Over Year Line Graph by Age

This graph shows that the middle-aged groups, from 35 to 64, tend to have the highest obesity rates, while young adults aged 18 to 24 have the lowest.

Over Year Line Graph by Income

This graph shows that people with higher incomes tend to have lower obesity rates.

Percentage of Obesity by State in 2017

West Virginia has the highest obesity rate at 38.1%, while Colorado has the lowest at 22.6%.

Physical Activities

Vermont has the highest physical activity rate at 59.7%, while Puerto Rico has the lowest at 19.6%.

Fast Food Restaurant

There are 676 fast food restaurants in California.

The most common fast food restaurants are McDonald’s, Burger King, and Taco Bell.

Mapping Fast Food Restaurants

To understand the distribution of fast food restaurants in the U.S., we visualized a random sample of 10,000 fast food restaurants from the Datafiniti dataset. The graph focuses on the contiguous U.S. and shows a disproportionately dense distribution of fast food restaurants on the East and West Coasts compared with Mid America, which is understandable as coastal areas have higher population density. We are interested in looking for differences in distribution among the most popular restaurants.

Fast Food Restaurant Distribution for The Top 4

This plot shows that McDonald’s is popular everywhere, while Burger King is more popular in the north and Taco Bell is more popular in the Mid America area, with higher density in states like Illinois, Indiana, Ohio and Wisconsin.

Mapping Obesity in the US

We first examine the obesity rate across the U.S. in 2011, the first year for which data are available.

The obesity rates were split into three groups: under 25%, 25-30%, and 30% and up. In 2011, 9 states were in the green shade, meaning their obesity rate was under 25%.
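The grouping described above can be sketched as a small helper. This is a minimal illustration, not the code used to produce the maps: the function name `obesity_group` is hypothetical, and the treatment of boundary values (a rate of exactly 25% or 30% falls into the higher group) is an assumption, since the report does not state it. The two rates used are the 2017 values quoted earlier in this report.

```python
def obesity_group(rate_percent):
    """Assign an obesity rate (in percent) to one of the three map groups.

    Assumption: boundary values (exactly 25 or 30) go to the higher group.
    """
    if rate_percent < 25.0:
        return "under 25%"
    if rate_percent < 30.0:
        return "25-30%"
    return "30% and up"


# The two 2017 state values quoted in this report.
rates_2017 = {"West Virginia": 38.1, "Colorado": 22.6}
groups_2017 = {state: obesity_group(rate) for state, rate in rates_2017.items()}
print(groups_2017)
```

Counting how many states fall into the "under 25%" group for each year would give the kind of year-by-year comparison discussed in this section.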

However, by 2017 the situation had changed dramatically: only one state on the map had an obesity rate below 25%. The map is presented below.

We therefore examine how the obesity rate changed over this period.

The grid shows the spread of obesity over the past six years. The red-shaded and blue-shaded states gradually increased in number and took over the states with low obesity rates. In 2013, there were 6 green states left; in 2015, there were only 5. By 2017, Colorado stood as the only state with an obesity rate under 25%, at 22.6%.

Obesity and Fast Food

To further explore the relationship, we mapped the distribution of fast food restaurants against state obesity levels in 2017.

By clustering the distribution of fast food restaurants, we see that states with obesity rates of 30% and up, indicated by the red shade, do have a heavier concentration of fast food restaurants in the middle and eastern U.S. However, a high-obesity state like Alaska does not have as many fast food restaurants.

Fast Food and Exercise

This finding led us to explore how fast food restaurants relate to other behaviors. We next look at the distribution of fast food restaurants together with the physical activity of state residents.

The fast food and exercise map shows that states in the middle and southern U.S. have lower activity rates and are also located in areas with dense fast food restaurant coverage. The northeastern U.S. has activity rates around 50-55%, and its fast food restaurant distribution is also dense, so the existence of fast food restaurants does not necessarily decrease exercise rates. The West Coast has higher activity rates and a lower density of fast food restaurants. Alaska has a high activity rate and a low density of fast food restaurants. The only state or territory with an activity rate under 35% is Puerto Rico, but the dataset has no information on fast food restaurants there.

Vegetable Intake and Fast Food Restaurants

We further explore whether the distribution of fast food restaurants is closely related to the nutritional intake of state residents. To do so, we mapped vegetable intake against the fast food distribution.

The vegetable intake graph shows that in the upper northern states, fewer than 15 percent of people eat vegetables less than once daily. In all East Coast states except New York, only 15-20 percent of residents eat vegetables less than once daily, whereas in the southwestern U.S. around 25-30 percent of people do. Interestingly, in Puerto Rico more than 35 percent of people eat vegetables less than once a day. Alaska, on the other hand, is blue-shaded, with 19 percent of people eating vegetables less than once a day.

Text Analysis

We collected text data from Twitter related to fast food, obesity and diet, the topics relevant to our project.

The bar chart indicates that states like Texas, South Carolina, Montana and California tweet positively about fast food, while states like Arizona, Colorado and Michigan tweet negatively about it.

The plot of fast food-related words shows that the most frequent ones refer to the actual food sold in fast food chains, such as “taco”, “whopper” and “sandwich”. This is consistent with the previous finding that Burger King and Taco Bell are among the most common fast food chains in the U.S.

For tweets related to obesity and diet, the word cloud shows words related to health (such as workout, body and weight), diet (such as vegan, coke and keto) and obesity-related diseases such as diabetes. This suggests that people who talk about obesity- and diet-related topics are generally concerned about obesity and seek to improve their lifestyle and diets as obesity prevention efforts.

The two scatterplots show the relationship between the obesity rate of each U.S. state and the overall sentiment of the Twitter text related to obesity. One would assume a positive relationship between obesity rate and sentiment towards obesity, such that states with a higher occurrence of obesity have tweets that display a more positive sentiment towards obesity. The two sentiment lexicons we used are AFINN and Bing. From the plots, we do not see a clear linear relationship between obesity rate and text sentiment. Instead, both plots suggest a quadratic relationship between obesity rate and text sentiment, such that states with the highest obesity rates display more extreme sentiments, both positive and negative, toward obesity than states with low obesity rates.
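To make the two lexicon approaches concrete, here is a minimal sketch of how each style of lexicon turns text into a score. AFINN assigns each word an integer score, while Bing labels each word as positive or negative. The tiny lexicons, the example tweets, and the function names below are all hypothetical and for illustration only; they are not the real lexicon entries, the collected tweets, or the code used in this project.

```python
import re

# Toy lexicons for illustration only; NOT the real AFINN or Bing entries.
TOY_AFINN = {"healthy": 2, "love": 3, "risk": -2, "death": -3}
TOY_BING = {
    "healthy": "positive",
    "love": "positive",
    "risk": "negative",
    "death": "negative",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def afinn_style_score(text):
    """AFINN-style: sum the integer scores of the words found in the lexicon."""
    return sum(TOY_AFINN.get(word, 0) for word in tokenize(text))


def bing_style_score(text):
    """Bing-style: number of positive words minus number of negative words."""
    labels = [TOY_BING[word] for word in tokenize(text) if word in TOY_BING]
    return labels.count("positive") - labels.count("negative")


# Hypothetical example tweets.
tweets = ["Obesity raises the risk of early death", "I love my healthy new diet"]
afinn_scores = [afinn_style_score(t) for t in tweets]
bing_scores = [bing_style_score(t) for t in tweets]
print(afinn_scores, bing_scores)
```

Averaging such per-tweet scores within each state would give one sentiment value per state, which could then be plotted against that state's obesity rate.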

The second word cloud shows all the negative words in tweets related to obesity and diet. It seems that people who tweet about obesity and diet are most concerned with the adverse health complications associated with obesity. The words with the highest frequencies include kill, risk, death and cancer, all of which are extremely negative words related to health issues caused by obesity.

Next, we explore whether there are differences in the words tweeted by people from states with high and low obesity rates. The graph of the log odds ratio of tweets from states with high obesity rates over states with low obesity rates reveals that people from states with high obesity rates are more likely to tweet highly extreme and negative words associated with obesity. These include words such as “risk”, “disease” and “death”. In comparison, people from states with low obesity rates tend to tweet content related to adopting healthy lifestyles and diets. These include words like “read”, “calorie” and “nutrient”. This indicates that people from states with low obesity rates are more concerned with maintaining their health and weight than people from states with high obesity rates.
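The log odds ratio used here can be illustrated with a short sketch. For each word, it compares how often the word appears in tweets from high-obesity states with how often it appears in tweets from low-obesity states; a positive value means the word is relatively more common in the high-obesity group. The add-one smoothing, the use of the natural logarithm, the function name, and the token lists are assumptions made for this illustration and may differ from the computation behind the graph.

```python
import math
from collections import Counter


def log_odds_ratio(high_tokens, low_tokens):
    """Return log((p_high(word)) / (p_low(word))) for every word in either group.

    Assumption: add-one smoothing so that words missing from one group
    do not produce a division by zero or log(0).
    """
    high = Counter(high_tokens)
    low = Counter(low_tokens)
    vocab = set(high) | set(low)
    high_total = sum(high.values()) + len(vocab)
    low_total = sum(low.values()) + len(vocab)
    return {
        word: math.log(
            ((high[word] + 1) / high_total) / ((low[word] + 1) / low_total)
        )
        for word in vocab
    }


# Hypothetical token lists, not the collected tweets.
high_tokens = ["risk", "death", "risk", "disease", "calorie"]
low_tokens = ["calorie", "nutrient", "read", "calorie", "risk"]
ratios = log_odds_ratio(high_tokens, low_tokens)

# Words sorted from most "high-obesity" to most "low-obesity".
for word, value in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{word:10s} {value:+.2f}")
```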

Lastly, we visualized the network of bigrams in obesity-related tweets. As expected, the network graph has two main clusters, around “obesity” and “diet”. The bigrams related to obesity are mostly negative. Some bigrams reflect the rising obesity rate in the U.S., such as “obesity prevalence” and “rampant obesity”, while others reflect the diseases associated with obesity, such as “childhood obesity” and “morbid obesity”. The diet-related words are more positive and include bigrams related to physical exercise and maintaining a healthy body image.
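As a rough illustration of what sits behind such a network, the sketch below counts bigrams (pairs of consecutive words) across tweets; each counted pair would become a weighted edge between two word nodes. The stop word list, the example tweets, and the function name are hypothetical, and dropping any bigram that contains a stop word is an assumed preprocessing step rather than a confirmed detail of our pipeline.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list.
STOP_WORDS = {"the", "is", "a", "of", "in", "and", "to"}


def bigrams(text):
    """Return consecutive word pairs, dropping pairs that contain a stop word."""
    words = re.findall(r"[a-z']+", text.lower())
    return [
        (first, second)
        for first, second in zip(words, words[1:])
        if first not in STOP_WORDS and second not in STOP_WORDS
    ]


# Hypothetical example tweets.
tweets = [
    "Childhood obesity is rising",
    "Morbid obesity and childhood obesity",
]

# Each (word, word) pair is an edge; its count is the edge weight.
edge_counts = Counter(pair for tweet in tweets for pair in bigrams(tweet))
print(edge_counts.most_common())
```

Keeping only the edges above some minimum count before drawing the graph keeps the network readable, and clusters then form around words that appear in many bigrams.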